This file describes the components of the replication package for "Investigator Characteristics and Respondent Behavior in Online Surveys"
Ariel White, Anton Strezhnev, Dominika Kruszewska, Christopher Lucas, Connor Huff
October 2017


There are three datasets included:
"data_investigator_characteristics.csv" is the data from the study reported in the main paper.  
"data_investigator_characteristics_second_experiment.csv" is the data from an earlier study discussed in Section H of the SI.
"MTurk_postings.csv" is the result of a search (described in main paper text) of MTurk postings from various academic disciplines. 


There are two scripts that replicate analyses from the main paper and SI:
"Main_Replication_Code.R" recreates all figures and tables from the main paper, as well as a number of figures and tables from the SI.
"Secondary_Replication_Code.R" recreates Figures A8-A11 from Section J of the SI (reporting another study we ran, randomizing only the researcher name shown on the consent form).   



Variable descriptions:

data_investigator_characteristics.csv includes 91 variables, described below.
1 ResponseID is a unique Qualtrics-generated identifier for each respondent
2 ID1Number is a hashed version of the respondent's MTurk ID
3 ID2Number is a hashed version of the respondent's IP address
4 ResponseSet is a Qualtrics bin 
5-6 StartDate/EndDate report the times respondent began/finished the survey
7 Finished is an indicator (1=yes) for whether the respondent completed the whole survey
8 researcherName captures the treatment condition the respondent saw
9 writePrompt is 2 for all respondents; due to a survey setup error, no respondents received a writing prompt in this study
10 consent is an indicator (1=yes) for whether the respondent noted their consent to participate in the study
11 womenrole captures respondents' stated beliefs about women's role (equality with men = 1, in the home = 2, haven't thought much about it = 3)
12-20 "president.GROUP" each contain respondents' reported willingness to vote for a member of [GROUP] for president (1=yes)
21 services indicates respondents' views about whether government should provide more (2) or fewer (1) services
22 education is an 8-level measure of educational attainment (see SI for values)
23 income is a 9-level measure of yearly household income (see SI for values)
24-40 "ac1" variables report respondents' selection(s) on our first attention check question (see full text in SI)
42-65 "ac2" variables report respondents' selection(s) on our second attention check question (see full text in SI)
66-69 "blacks" variables capture respondents' answers to the four-question "racial resentment" battery taken from the ANES
70 freeresponse.WomanPresident was an unused variable that would have captured responses to our (accidentally not given) writing prompt
71 freeresponse.PoliticsLife was an unused variable that would have captured responses to our (accidentally not given) writing prompt
72 turnout is self-reported voter turnout in 2012 (3=yes, 2=no, 1=don't remember)
73 vote.choice is self-reported vote choice in 2012 (1=Romney, 2=Obama, 3=other)
74 vote.choice.text is self-reported (write-in) 2012 vote choice if they selected "other" in the prior question
75 party is reported partisanship (1=D, 2=R, 3=Independent)
76 partyln captures whether respondents lean toward one party if they selected Independent on the prior question (1=lean Republican, 2=lean Democratic)
77-83 "ethnicity" variables capture respondents' self-reported race/ethnicity
84 gender captures respondent gender (1=male, 2=female, 3=other)
85 gender.text captures respondent free-text report of gender if they selected "other" on the prior question
86 purpose captures respondents' (free-text) guesses about the purpose of our study
87-91 "behavior" variables capture whether respondents reported doing any of the listed activities (such as listening to music) while completing the survey 
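
The numeric codes listed above can be turned into readable labels before analysis. The following is a minimal Python sketch, not part of the replication package (whose scripts are written in R). The label dictionary is transcribed from the descriptions above; the helper name decode_row and the sample rows are made up for illustration, and the sketch assumes coded cells hold plain integers, leaving anything else (blank cells, other missing-value markers) untouched.

```python
import csv
import io

# Value labels transcribed from the variable descriptions in this file.
VALUE_LABELS = {
    "womenrole": {1: "equality with men", 2: "in the home",
                  3: "haven't thought much about it"},
    "services": {1: "fewer services", 2: "more services"},
    "turnout": {1: "don't remember", 2: "no", 3: "yes"},
    "vote.choice": {1: "Romney", 2: "Obama", 3: "other"},
    "party": {1: "Democrat", 2: "Republican", 3: "Independent"},
    "partyln": {1: "lean Republican", 2: "lean Democratic"},
    "gender": {1: "male", 2: "female", 3: "other"},
}


def decode_row(row):
    """Return a copy of row with known numeric codes replaced by labels.

    Hypothetical helper: cells that are blank, non-numeric, or outside the
    documented code range are left exactly as they were read.
    """
    decoded = dict(row)
    for name, labels in VALUE_LABELS.items():
        raw = (row.get(name) or "").strip()
        if raw.isdigit() and int(raw) in labels:
            decoded[name] = labels[int(raw)]
    return decoded


# Made-up rows for illustration only; they are NOT taken from the real data.
sample = io.StringIO(
    "party,partyln,gender,turnout,vote.choice\n"
    "1,,2,3,2\n"
    "3,1,1,2,\n"
)
decoded_rows = [decode_row(r) for r in csv.DictReader(sample)]
for r in decoded_rows:
    print(r)
```

To apply the same helper to the real file, replace the in-memory sample with open("data_investigator_characteristics.csv", newline="").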

data_investigator_characteristics_second_experiment.csv includes 91 variables.  All variables with the same names as the previous file are as described above, with two exceptions noted below.  Several additional variables are as follows:
8 MTurkCode is the code we generated for respondents to report in MTurk for payment
9 writePrompt captures which writing prompt respondents were assigned (0=WomanPresident, 1=PoliticsLife)
69 freeresponse.WomanPresident captures respondents' answers to a writing prompt (assigned in "writePrompt") about what they thought of the idea of having a woman president
70 freeresponse.PoliticsLife captures respondents' answers to a writing prompt (assigned in "writePrompt") about a time that politics affected their life
91 display.order.questions captures the order in which questions were displayed to respondents
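
In the second experiment, writePrompt determines which of the two free-response columns holds a respondent's answer. The Python sketch below (an illustration, not part of the R replication scripts) picks the relevant column for each row. The helper name assigned_response and the sample rows are hypothetical, and the sketch assumes writePrompt is stored as the literal text "0" or "1".

```python
import csv
import io

# Mapping taken from the writePrompt description above.
PROMPT_COLUMN = {
    0: "freeresponse.WomanPresident",
    1: "freeresponse.PoliticsLife",
}


def assigned_response(row):
    """Return the free-text answer to the prompt this respondent was assigned.

    Hypothetical helper: returns None when writePrompt is neither "0" nor "1".
    """
    raw = (row.get("writePrompt") or "").strip()
    if raw not in ("0", "1"):
        return None
    return row.get(PROMPT_COLUMN[int(raw)])


# Made-up rows for illustration only; they are NOT taken from the real data.
sample = io.StringIO(
    "writePrompt,freeresponse.WomanPresident,freeresponse.PoliticsLife\n"
    "0,example answer A,\n"
    "1,,example answer B\n"
    ",,\n"
)
responses = [assigned_response(r) for r in csv.DictReader(sample)]
print(responses)
```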

MTurk_postings.csv contains 4 columns:
Search Term is the search term used to find postings on MTurk (Politics, Economics, etc.)
Accounts Using Real Names is the count of MTurk postings found with that search term that used an apparently real researcher name to post the task
Total Accounts is the number of accounts that posted a task using the search term
Proportion Using Name is Accounts Using Real Names divided by Total Accounts
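
The relationship between the last three columns can be checked by recomputing the proportion from the two counts. The Python sketch below is an illustration only, not part of the R replication scripts. The helper name proportion_using_name, the tolerance of 0.005, and the sample rows are assumptions; in particular, this file does not say whether the stored proportion is rounded, so the comparison uses a tolerance rather than exact equality.

```python
import csv
import io


def proportion_using_name(row):
    """Recompute Accounts Using Real Names / Total Accounts for one row.

    Hypothetical helper: returns None when Total Accounts is zero, since the
    proportion is undefined in that case.
    """
    total = int(row["Total Accounts"])
    if total == 0:
        return None
    return int(row["Accounts Using Real Names"]) / total


# Made-up rows for illustration only; they are NOT taken from the real data.
sample = io.StringIO(
    "Search Term,Accounts Using Real Names,Total Accounts,Proportion Using Name\n"
    "Example A,3,12,0.25\n"
    "Example B,0,0,\n"
)

TOLERANCE = 0.005  # assumed allowance for rounding in the stored column

checks = []
for row in csv.DictReader(sample):
    computed = proportion_using_name(row)
    stored = row["Proportion Using Name"].strip()
    if computed is None or stored == "":
        matches = None  # nothing to compare
    else:
        matches = abs(computed - float(stored)) <= TOLERANCE
    checks.append((row["Search Term"], computed, matches))
    print(row["Search Term"], computed, matches)
```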

